Overview

Dataset statistics

Number of variables: 17
Number of observations: 3390
Missing cells: 510
Missing cells (%): 0.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 450.4 KiB
Average record size in memory: 136.0 B
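
The percentage and per-record figures above follow from the raw counts. The sketch below redoes that arithmetic; the variable names are illustrative, and the assumption that the missing-cell percentage is taken over variables times observations is inferred from the numbers rather than stated in the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 17
n_observations = 3390
missing_cells = 510
total_size_kib = 450.4

# Assumption: the denominator is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The reported total size is already rounded to 0.1 KiB, so this
# only approximates the reported 136.0 B per record.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.2f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```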

Variable types

Numeric: 9
Categorical: 7
Boolean: 1

Alerts

cigsPerDay is highly overall correlated with is_smoking (High correlation)
sysBP is highly overall correlated with diaBP and 1 other field (High correlation)
diaBP is highly overall correlated with sysBP and 1 other field (High correlation)
glucose is highly overall correlated with diabetes (High correlation)
is_smoking is highly overall correlated with cigsPerDay (High correlation)
prevalentHyp is highly overall correlated with sysBP and 1 other field (High correlation)
diabetes is highly overall correlated with glucose (High correlation)
BPMeds is highly imbalanced (80.6%) (Imbalance)
prevalentStroke is highly imbalanced (94.4%) (Imbalance)
diabetes is highly imbalanced (82.8%) (Imbalance)
education has 87 (2.6%) missing values (Missing)
BPMeds has 44 (1.3%) missing values (Missing)
totChol has 38 (1.1%) missing values (Missing)
glucose has 304 (9.0%) missing values (Missing)
id is uniformly distributed (Uniform)
id has unique values (Unique)
cigsPerDay has 1703 (50.2%) zeros (Zeros)
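
The imbalance percentages in the alerts are not class shares (BPMeds is 97.0% one class among non-missing rows, yet is reported as 80.6%). They are consistent with a score of one minus the normalized Shannon entropy of the class counts. That formula is an assumption checked against the three reported figures, not something the report itself states; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Assumed formula, see the lead-in.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Non-missing class counts from the variable sections, with the reported score.
reported = {
    "BPMeds": ([3246, 100], 80.6),
    "prevalentStroke": ([3368, 22], 94.4),
    "diabetes": ([3303, 87], 82.8),
}

for name, (counts, reported_pct) in reported.items():
    computed_pct = 100 * imbalance_score(counts)
    print(f"{name}: computed {computed_pct:.1f}%, reported {reported_pct}%")
```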

Reproduction

Analysis started: 2024-02-10 02:21:22.666376
Analysis finished: 2024-02-10 02:21:26.959235
Duration: 4.29 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
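
The reported duration is simply the difference between the two timestamps. A minimal check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-02-10 02:21:22.666376")
finished = datetime.fromisoformat("2024-02-10 02:21:26.959235")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```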

Variables

id
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 3390
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1694.5
Minimum: 0
Maximum: 3389
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 169.45
Q1: 847.25
Median: 1694.5
Q3: 2541.75
95th percentile: 3219.55
Maximum: 3389
Range: 3389
Interquartile range (IQR): 1694.5

Descriptive statistics

Standard deviation: 978.75303
Coefficient of variation (CV): 0.5776058
Kurtosis: -1.2
Mean: 1694.5
Median Absolute Deviation (MAD): 847.5
Skewness: 0
Sum: 5744355
Variance: 957957.5
Monotonicity: Strictly increasing
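
These statistics are exactly what a plain row counter 0, 1, ..., 3389 would produce, which suggests id carries no information beyond row order. The sketch below rebuilds that sequence and recomputes several of the figures. Treating id as a consecutive counter is an inference from the Unique, Uniform, and Strictly increasing properties, and the sample variance (n - 1 denominator) is assumed.

```python
import statistics

n = 3390
ids = list(range(n))  # 0, 1, ..., 3389

mean = statistics.fmean(ids)
total = sum(ids)
variance = statistics.variance(ids)  # sample variance, n - 1 denominator
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)

# Closed forms for the integers 0..n-1.
closed_form_sum = n * (n - 1) // 2
closed_form_variance = n * (n + 1) / 12

print(mean, total, variance, mad)
```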
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
0  1  < 0.1%
2277  1  < 0.1%
2253  1  < 0.1%
2254  1  < 0.1%
2255  1  < 0.1%
2256  1  < 0.1%
2257  1  < 0.1%
2258  1  < 0.1%
2259  1  < 0.1%
2260  1  < 0.1%
Other values (3380)  3380  99.7%

Minimum 10 values

Value  Count  Frequency (%)
0  1  < 0.1%
1  1  < 0.1%
2  1  < 0.1%
3  1  < 0.1%
4  1  < 0.1%
5  1  < 0.1%
6  1  < 0.1%
7  1  < 0.1%
8  1  < 0.1%
9  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
3389  1  < 0.1%
3388  1  < 0.1%
3387  1  < 0.1%
3386  1  < 0.1%
3385  1  < 0.1%
3384  1  < 0.1%
3383  1  < 0.1%
3382  1  < 0.1%
3381  1  < 0.1%
3380  1  < 0.1%

age
Real number (ℝ)

Distinct: 39
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.542183
Minimum: 32
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 32
5th percentile: 37
Q1: 42
Median: 49
Q3: 56
95th percentile: 64
Maximum: 70
Range: 38
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.5928781
Coefficient of variation (CV): 0.17344569
Kurtosis: -1.0048019
Mean: 49.542183
Median Absolute Deviation (MAD): 7
Skewness: 0.22579588
Sum: 167948
Variance: 73.837553
Monotonicity: Not monotonic
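
Several of the statistics above are derived from others: the range and IQR are differences of quantiles, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A small sketch that recomputes them from the reported age figures (the dictionary keys are illustrative names):

```python
# Reported values for age, copied from the blocks above.
age = {
    "min": 32,
    "max": 70,
    "q1": 42,
    "q3": 56,
    "mean": 49.542183,
    "std": 8.5928781,
}

value_range = age["max"] - age["min"]
iqr = age["q3"] - age["q1"]
cv = age["std"] / age["mean"]
variance = age["std"] ** 2

print(f"range={value_range}, IQR={iqr}, CV={cv:.8f}, variance={variance:.4f}")
```

The same relationships hold for the other numeric variables in this report, up to rounding of the printed inputs.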
Histogram with fixed size bins (bins=39)

Common values

Value  Count  Frequency (%)
40  148  4.4%
42  145  4.3%
41  144  4.2%
46  140  4.1%
39  139  4.1%
44  135  4.0%
48  134  4.0%
45  131  3.9%
43  127  3.7%
38  119  3.5%
Other values (29)  2028  59.8%

Minimum 10 values

Value  Count  Frequency (%)
32  1  < 0.1%
33  4  0.1%
34  16  0.5%
35  29  0.9%
36  75  2.2%
37  73  2.2%
38  119  3.5%
39  139  4.1%
40  148  4.4%
41  144  4.2%

Maximum 10 values

Value  Count  Frequency (%)
70  2  0.1%
69  5  0.1%
68  14  0.4%
67  33  1.0%
66  30  0.9%
65  43  1.3%
64  75  2.2%
63  93  2.7%
62  80  2.4%
61  87  2.6%

education
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 87
Missing (%): 2.6%
Memory size: 26.6 KiB

Value  Count
1.0  1391
2.0  990
3.0  549
4.0  373

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 9909
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 4.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value  Count  Frequency (%)
1.0  1391  41.0%
2.0  990  29.2%
3.0  549  16.2%
4.0  373  11.0%
(Missing)  87  2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1.0  1391  42.1%
2.0  990  30.0%
3.0  549  16.6%
4.0  373  11.3%
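
The two education frequency tables above show different percentages for the same counts (for example 41.0% versus 42.1% for the value 1.0). The difference is the denominator: the first table divides by all 3390 observations and lists the missing values as their own row, while the plot table divides by the 3303 non-missing observations. That reading is inferred from the numbers; the sketch below reproduces both columns.

```python
# Counts copied from the education section.
counts = {"1.0": 1391, "2.0": 990, "3.0": 549, "4.0": 373}
n_missing = 87

n_valid = sum(counts.values())   # observations with a value
n_total = n_valid + n_missing    # all observations

# Share of all rows versus share of non-missing rows.
pct_of_all = {v: round(100 * c / n_total, 1) for v, c in counts.items()}
pct_of_valid = {v: round(100 * c / n_valid, 1) for v, c in counts.items()}

for value in counts:
    print(value, pct_of_all[value], pct_of_valid[value])
```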

Most occurring characters

Value  Count  Frequency (%)
.  3303  33.3%
0  3303  33.3%
1  1391  14.0%
2  990  10.0%
3  549  5.5%
4  373  3.8%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  6606  66.7%
Other Punctuation  3303  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  3303  50.0%
1  1391  21.1%
2  990  15.0%
3  549  8.3%
4  373  5.6%

Other Punctuation
Value  Count  Frequency (%)
.  3303  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  9909  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
.  3303  33.3%
0  3303  33.3%
1  1391  14.0%
2  990  10.0%
3  549  5.5%
4  373  3.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  9909  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
.  3303  33.3%
0  3303  33.3%
1  1391  14.0%
2  990  10.0%
3  549  5.5%
4  373  3.8%

sex
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.6 KiB

Value  Count
F  1923
M  1467

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: F
4th row: M
5th row: F

Common Values

Value  Count  Frequency (%)
F  1923  56.7%
M  1467  43.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
f  1923  56.7%
m  1467  43.3%

Most occurring characters

Value  Count  Frequency (%)
F  1923  56.7%
M  1467  43.3%

Most occurring categories

Value  Count  Frequency (%)
Uppercase Letter  3390  100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
F  1923  56.7%
M  1467  43.3%

Most occurring scripts

Value  Count  Frequency (%)
Latin  3390  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
F  1923  56.7%
M  1467  43.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3390  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
F  1923  56.7%
M  1467  43.3%

is_smoking
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Value  Count  Frequency (%)
False  1703  50.2%
True  1687  49.8%

cigsPerDay
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 32
Distinct (%): 1.0%
Missing: 22
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.0694774
Minimum: 0
Maximum: 70
Zeros: 1703
Zeros (%): 50.2%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 20
95th percentile: 30
Maximum: 70
Range: 70
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 11.879078
Coefficient of variation (CV): 1.3097863
Kurtosis: 0.97552913
Mean: 9.0694774
Median Absolute Deviation (MAD): 0
Skewness: 1.2230054
Sum: 30546
Variance: 141.11249
Monotonicity: Not monotonic
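
For a column with missing values, the reported mean is taken over the non-missing observations, while the zero percentage is taken over all observations. The sketch below checks both against the cigsPerDay figures; the variable names are illustrative, and the choice of denominators is inferred from the numbers.

```python
# Figures copied from the cigsPerDay section.
n_observations = 3390
n_missing = 22
column_sum = 30546
n_zeros = 1703

n_valid = n_observations - n_missing

mean_over_valid = column_sum / n_valid        # matches the reported mean
mean_over_all = column_sum / n_observations   # would understate it
zeros_pct = 100 * n_zeros / n_observations

print(f"valid rows: {n_valid}")
print(f"mean over valid rows: {mean_over_valid:.7f}")
print(f"mean over all rows:   {mean_over_all:.7f}")
print(f"zeros: {zeros_pct:.1f}%")
```

The zero count (1703) equals the number of rows where is_smoking is False, which fits the high correlation reported between the two variables.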
Histogram with fixed size bins (bins=32)

Common values

Value  Count  Frequency (%)
0  1703  50.2%
20  606  17.9%
30  176  5.2%
15  172  5.1%
10  106  3.1%
9  104  3.1%
5  103  3.0%
3  79  2.3%
40  62  1.8%
1  48  1.4%
Other values (22)  209  6.2%

Minimum 10 values

Value  Count  Frequency (%)
0  1703  50.2%
1  48  1.4%
2  17  0.5%
3  79  2.3%
4  7  0.2%
5  103  3.0%
6  14  0.4%
7  8  0.2%
8  10  0.3%
9  104  3.1%

Maximum 10 values

Value  Count  Frequency (%)
70  1  < 0.1%
60  8  0.2%
50  6  0.2%
45  2  0.1%
43  42  1.2%
40  62  1.8%
38  1  < 0.1%
35  17  0.5%
30  176  5.2%
25  44  1.3%

BPMeds
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 0.1%
Missing: 44
Missing (%): 1.3%
Memory size: 26.6 KiB

Value  Count
0.0  3246
1.0  100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 10038
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0  3246  95.8%
1.0  100  2.9%
(Missing)  44  1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0  3246  97.0%
1.0  100  3.0%

Most occurring characters

Value  Count  Frequency (%)
0  6592  65.7%
.  3346  33.3%
1  100  1.0%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  6692  66.7%
Other Punctuation  3346  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  6592  98.5%
1  100  1.5%

Other Punctuation
Value  Count  Frequency (%)
.  3346  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  10038  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  6592  65.7%
.  3346  33.3%
1  100  1.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  10038  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  6592  65.7%
.  3346  33.3%
1  100  1.0%

prevalentStroke
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.6 KiB

Value  Count
0  3368
1  22

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  3368  99.4%
1  22  0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  3368  99.4%
1  22  0.6%

Most occurring characters

Value  Count  Frequency (%)
0  3368  99.4%
1  22  0.6%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  3390  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  3368  99.4%
1  22  0.6%

Most occurring scripts

Value  Count  Frequency (%)
Common  3390  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  3368  99.4%
1  22  0.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3390  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  3368  99.4%
1  22  0.6%

prevalentHyp
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.6 KiB

Value  Count
0  2321
1  1069

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0  2321  68.5%
1  1069  31.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2321  68.5%
1  1069  31.5%

Most occurring characters

Value  Count  Frequency (%)
0  2321  68.5%
1  1069  31.5%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  3390  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  2321  68.5%
1  1069  31.5%

Most occurring scripts

Value  Count  Frequency (%)
Common  3390  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  2321  68.5%
1  1069  31.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3390  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  2321  68.5%
1  1069  31.5%

diabetes
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.6 KiB

Value  Count
0  3303
1  87

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  3303  97.4%
1  87  2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  3303  97.4%
1  87  2.6%

Most occurring characters

Value  Count  Frequency (%)
0  3303  97.4%
1  87  2.6%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  3390  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  3303  97.4%
1  87  2.6%

Most occurring scripts

Value  Count  Frequency (%)
Common  3390  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  3303  97.4%
1  87  2.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3390  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  3303  97.4%
1  87  2.6%

totChol
Real number (ℝ)

Distinct: 240
Distinct (%): 7.2%
Missing: 38
Missing (%): 1.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 237.07428
Minimum: 107
Maximum: 696
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 107
5th percentile: 169
Q1: 206
Median: 234
Q3: 264
95th percentile: 313.45
Maximum: 696
Range: 589
Interquartile range (IQR): 58

Descriptive statistics

Standard deviation: 45.24743
Coefficient of variation (CV): 0.1908576
Kurtosis: 4.7813217
Mean: 237.07428
Median Absolute Deviation (MAD): 29
Skewness: 0.9406357
Sum: 794673
Variance: 2047.3299
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
240  65  1.9%
210  51  1.5%
220  48  1.4%
260  46  1.4%
232  45  1.3%
230  43  1.3%
225  42  1.2%
250  41  1.2%
200  41  1.2%
270  40  1.2%
Other values (230)  2890  85.3%

Minimum 10 values

Value  Count  Frequency (%)
107  1  < 0.1%
113  1  < 0.1%
119  1  < 0.1%
124  1  < 0.1%
126  1  < 0.1%
129  1  < 0.1%
133  1  < 0.1%
135  2  0.1%
137  1  < 0.1%
140  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
696  1  < 0.1%
600  1  < 0.1%
464  1  < 0.1%
453  1  < 0.1%
439  1  < 0.1%
432  1  < 0.1%
410  3  0.1%
398  1  < 0.1%
392  1  < 0.1%
391  2  0.1%

sysBP
Real number (ℝ)

Distinct: 226
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 132.60118
Minimum: 83.5
Maximum: 295
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 83.5
5th percentile: 104
Q1: 117
Median: 128.5
Q3: 144
95th percentile: 175.275
Maximum: 295
Range: 211.5
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 22.29203
Coefficient of variation (CV): 0.16811336
Kurtosis: 2.3659225
Mean: 132.60118
Median Absolute Deviation (MAD): 13.5
Skewness: 1.1758367
Sum: 449518
Variance: 496.93458
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
110  87  2.6%
120  85  2.5%
130  85  2.5%
125  69  2.0%
115  68  2.0%
124  63  1.9%
135  61  1.8%
128  61  1.8%
126  59  1.7%
122  58  1.7%
Other values (216)  2694  79.5%

Minimum 10 values

Value  Count  Frequency (%)
83.5  2  0.1%
85  1  < 0.1%
85.5  1  < 0.1%
90  2  0.1%
92.5  1  < 0.1%
93  2  0.1%
93.5  2  0.1%
94  2  0.1%
95  5  0.1%
95.5  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
295  1  < 0.1%
248  1  < 0.1%
244  1  < 0.1%
243  1  < 0.1%
235  1  < 0.1%
232  1  < 0.1%
230  1  < 0.1%
220  2  0.1%
217  1  < 0.1%
215  3  0.1%

diaBP
Real number (ℝ)

Distinct: 142
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.883038
Minimum: 48
Maximum: 142.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 48
5th percentile: 66
Q1: 74.5
Median: 82
Q3: 90
95th percentile: 105
Maximum: 142.5
Range: 94.5
Interquartile range (IQR): 15.5

Descriptive statistics

Standard deviation: 12.023581
Coefficient of variation (CV): 0.14506685
Kurtosis: 1.273995
Mean: 82.883038
Median Absolute Deviation (MAD): 8
Skewness: 0.71817267
Sum: 280973.5
Variance: 144.5665
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
80  213  6.3%
82  123  3.6%
70  109  3.2%
85  107  3.2%
90  100  2.9%
87  96  2.8%
81  94  2.8%
84  94  2.8%
75  90  2.7%
78  89  2.6%
Other values (132)  2275  67.1%

Minimum 10 values

Value  Count  Frequency (%)
48  1  < 0.1%
50  1  < 0.1%
51  1  < 0.1%
52  2  0.1%
53  1  < 0.1%
54  1  < 0.1%
55  2  0.1%
56  2  0.1%
57  5  0.1%
57.5  2  0.1%

Maximum 10 values

Value  Count  Frequency (%)
142.5  1  < 0.1%
136  2  0.1%
135  2  0.1%
133  2  0.1%
130  5  0.1%
129  1  < 0.1%
128  1  < 0.1%
127.5  1  < 0.1%
125  3  0.1%
124.5  1  < 0.1%

BMI
Real number (ℝ)

Distinct: 1259
Distinct (%): 37.3%
Missing: 14
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.794964
Minimum: 15.96
Maximum: 56.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 15.96
5th percentile: 20.06
Q1: 23.02
Median: 25.38
Q3: 28.04
95th percentile: 32.8525
Maximum: 56.8
Range: 40.84
Interquartile range (IQR): 5.02

Descriptive statistics

Standard deviation: 4.1154488
Coefficient of variation (CV): 0.15954466
Kurtosis: 2.8346067
Mean: 25.794964
Median Absolute Deviation (MAD): 2.49
Skewness: 1.022252
Sum: 87083.8
Variance: 16.936918
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
22.91  17  0.5%
22.54  16  0.5%
22.19  15  0.4%
23.48  12  0.4%
25.09  12  0.4%
23.1  11  0.3%
24.56  11  0.3%
23.09  11  0.3%
22.73  11  0.3%
25.94  11  0.3%
Other values (1249)  3249  95.8%
(Missing)  14  0.4%

Minimum 10 values

Value  Count  Frequency (%)
15.96  1  < 0.1%
16.48  1  < 0.1%
16.59  1  < 0.1%
16.61  1  < 0.1%
16.69  1  < 0.1%
16.71  1  < 0.1%
16.73  1  < 0.1%
16.75  1  < 0.1%
16.92  1  < 0.1%
16.98  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
56.8  1  < 0.1%
51.28  1  < 0.1%
45.8  1  < 0.1%
45.79  1  < 0.1%
44.71  1  < 0.1%
44.55  1  < 0.1%
44.27  1  < 0.1%
44.09  1  < 0.1%
43.69  1  < 0.1%
43.67  1  < 0.1%

heartRate
Real number (ℝ)

Distinct: 68
Distinct (%): 2.0%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.977279
Minimum: 45
Maximum: 143
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 45
5th percentile: 60
Q1: 68
Median: 75
Q3: 83
95th percentile: 98
Maximum: 143
Range: 98
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 11.971868
Coefficient of variation (CV): 0.15757169
Kurtosis: 0.9796436
Mean: 75.977279
Median Absolute Deviation (MAD): 7
Skewness: 0.67648972
Sum: 257487
Variance: 143.32563
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
75  442  13.0%
80  298  8.8%
70  241  7.1%
85  191  5.6%
72  184  5.4%
60  183  5.4%
65  152  4.5%
90  143  4.2%
68  133  3.9%
63  76  2.2%
Other values (58)  1346  39.7%

Minimum 10 values

Value  Count  Frequency (%)
45  1  < 0.1%
47  1  < 0.1%
48  4  0.1%
50  15  0.4%
51  1  < 0.1%
52  12  0.4%
53  8  0.2%
54  10  0.3%
55  25  0.7%
56  17  0.5%

Maximum 10 values

Value  Count  Frequency (%)
143  1  < 0.1%
140  1  < 0.1%
125  3  0.1%
122  2  0.1%
120  5  0.1%
115  4  0.1%
112  3  0.1%
110  33  1.0%
108  7  0.2%
107  2  0.1%

glucose
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 132
Distinct (%): 4.3%
Missing: 304
Missing (%): 9.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.08652
Minimum: 40
Maximum: 394
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.6 KiB

Quantile statistics

Minimum: 40
5th percentile: 62
Q1: 71
Median: 78
Q3: 87
95th percentile: 110
Maximum: 394
Range: 354
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 24.244753
Coefficient of variation (CV): 0.29535609
Kurtosis: 57.356963
Mean: 82.08652
Median Absolute Deviation (MAD): 8
Skewness: 6.1443897
Sum: 253319
Variance: 587.80807
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
75  149  4.4%
83  135  4.0%
70  123  3.6%
77  122  3.6%
80  118  3.5%
78  117  3.5%
73  116  3.4%
74  114  3.4%
85  112  3.3%
76  104  3.1%
Other values (122)  1876  55.3%
(Missing)  304  9.0%

Minimum 10 values

Value  Count  Frequency (%)
40  1  < 0.1%
43  1  < 0.1%
44  2  0.1%
45  3  0.1%
47  2  0.1%
48  1  < 0.1%
50  2  0.1%
52  2  0.1%
53  5  0.1%
54  3  0.1%

Maximum 10 values

Value  Count  Frequency (%)
394  2  0.1%
386  1  < 0.1%
368  1  < 0.1%
348  1  < 0.1%
332  1  < 0.1%
320  1  < 0.1%
297  1  < 0.1%
294  1  < 0.1%
274  1  < 0.1%
270  1  < 0.1%

TenYearCHD
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 26.6 KiB

Value  Count
0  2879
1  511

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0  2879  84.9%
1  511  15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2879  84.9%
1  511  15.1%

Most occurring characters

Value  Count  Frequency (%)
0  2879  84.9%
1  511  15.1%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  3390  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  2879  84.9%
1  511  15.1%

Most occurring scripts

Value  Count  Frequency (%)
Common  3390  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  2879  84.9%
1  511  15.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3390  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  2879  84.9%
1  511  15.1%

Interactions

[81 interaction plots; images not preserved]

Correlations

variable  id  age  cigsPerDay  totChol  sysBP  diaBP  BMI  heartRate  glucose  education  sex  is_smoking  BPMeds  prevalentStroke  prevalentHyp  diabetes  TenYearCHD
id  1.000  0.015  -0.008  -0.014  0.019  -0.004  0.040  0.019  0.024  0.033  0.000  0.018  0.015  0.016  0.041  0.000  0.000
age  0.015  1.000  -0.212  0.303  0.404  0.220  0.147  -0.005  0.111  0.148  0.018  0.222  0.134  0.050  0.301  0.100  0.219
cigsPerDay  -0.008  -0.212  1.000  -0.038  -0.129  -0.104  -0.145  0.076  -0.097  0.042  0.341  0.846  0.000  0.000  0.115  0.029  0.058
totChol  -0.014  0.303  -0.038  1.000  0.220  0.180  0.153  0.082  0.036  0.018  0.077  0.052  0.078  0.000  0.154  0.107  0.088
sysBP  0.019  0.404  -0.129  0.220  1.000  0.773  0.332  0.166  0.116  0.078  0.103  0.137  0.301  0.049  0.715  0.138  0.210
diaBP  -0.004  0.220  -0.104  0.180  0.773  1.000  0.379  0.173  0.051  0.050  0.074  0.122  0.225  0.047  0.639  0.053  0.147
BMI  0.040  0.147  -0.145  0.153  0.332  0.379  1.000  0.063  0.073  0.086  0.209  0.164  0.121  0.223  0.293  0.106  0.078
heartRate  0.019  -0.005  0.076  0.082  0.166  0.173  0.063  1.000  0.095  0.030  0.121  0.070  0.091  0.000  0.150  0.034  0.000
glucose  0.024  0.111  -0.097  0.036  0.116  0.051  0.073  0.095  1.000  0.030  0.027  0.079  0.114  0.041  0.073  0.740  0.138
education  0.033  0.148  0.042  0.018  0.078  0.050  0.086  0.030  0.030  1.000  0.139  0.062  0.000  0.019  0.088  0.057  0.076
sex  0.000  0.018  0.341  0.077  0.103  0.074  0.209  0.121  0.027  0.139  1.000  0.214  0.039  0.000  0.000  0.000  0.082
is_smoking  0.018  0.222  0.846  0.052  0.137  0.122  0.164  0.070  0.079  0.062  0.214  1.000  0.032  0.036  0.117  0.049  0.029
BPMeds  0.015  0.134  0.000  0.078  0.301  0.225  0.121  0.091  0.114  0.000  0.039  0.032  1.000  0.107  0.257  0.063  0.084
prevalentStroke  0.016  0.050  0.000  0.000  0.049  0.047  0.223  0.000  0.041  0.019  0.000  0.036  0.107  1.000  0.065  0.000  0.061
prevalentHyp  0.041  0.301  0.115  0.154  0.715  0.639  0.293  0.150  0.073  0.088  0.000  0.117  0.257  0.065  1.000  0.079  0.165
diabetes  0.000  0.100  0.029  0.107  0.138  0.053  0.106  0.034  0.740  0.057  0.000  0.049  0.063  0.000  0.079  1.000  0.100
TenYearCHD  0.000  0.219  0.058  0.088  0.210  0.147  0.078  0.000  0.138  0.076  0.082  0.029  0.084  0.061  0.165  0.100  1.000
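
The High correlation alerts in the overview can be recovered from this matrix by keeping the pairs whose absolute coefficient reaches a cutoff. The sketch below uses a handful of entries copied from the matrix. The cutoff of 0.5 is an assumption that happens to reproduce the alert list, not a value stated in the report, and `high_correlation_pairs` is a hypothetical helper.

```python
# A few off-diagonal entries copied from the correlation matrix.
correlations = {
    ("cigsPerDay", "is_smoking"): 0.846,
    ("sysBP", "diaBP"): 0.773,
    ("glucose", "diabetes"): 0.740,
    ("sysBP", "prevalentHyp"): 0.715,
    ("diaBP", "prevalentHyp"): 0.639,
    ("age", "sysBP"): 0.404,
    ("BMI", "diaBP"): 0.379,
    ("sex", "cigsPerDay"): 0.341,
}

THRESHOLD = 0.5  # assumed cutoff, see the lead-in


def high_correlation_pairs(pairs, threshold):
    """Return the pairs with |r| >= threshold, strongest first."""
    selected = [pair for pair, r in pairs.items() if abs(r) >= threshold]
    return sorted(selected, key=lambda pair: abs(pairs[pair]), reverse=True)


flagged = high_correlation_pairs(correlations, THRESHOLD)
for left, right in flagged:
    print(f"{left} ~ {right}: {correlations[(left, right)]:.3f}")
```

With this cutoff, sysBP and diaBP are each flagged against one another and against prevalentHyp, matching the "and 1 other field" wording in the alerts.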

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
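
The per-variable missing counts reported in the variable sections add up to the 510 missing cells in the overview. The sketch below tallies them and ranks the columns by missing share; the names are illustrative.

```python
n_observations = 3390

# Missing counts copied from the variable sections (columns with none omitted).
missing_by_column = {
    "education": 87,
    "cigsPerDay": 22,
    "BPMeds": 44,
    "totChol": 38,
    "BMI": 14,
    "heartRate": 1,
    "glucose": 304,
}

total_missing = sum(missing_by_column.values())

# Rank columns from most to least missing.
ranked = sorted(missing_by_column.items(), key=lambda item: item[1], reverse=True)
for name, count in ranked:
    print(f"{name:<12}{count:>5}{100 * count / n_observations:>7.1f}%")

print(f"total missing cells: {total_missing}")
```

Only the four columns with more than 1% missing (glucose, education, BPMeds, totChol) appear in the Missing alerts, which suggests an alert threshold near that level; the threshold itself is not shown in the report.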

Sample

First rows

index  id  age  education  sex  is_smoking  cigsPerDay  BPMeds  prevalentStroke  prevalentHyp  diabetes  totChol  sysBP  diaBP  BMI  heartRate  glucose  TenYearCHD
0  0  64  2.0  F  YES  3.0  0.0  0  0  0  221.0  148.0  85.0  NaN  90.0  80.0  1
1  1  36  4.0  M  NO  0.0  0.0  0  1  0  212.0  168.0  98.0  29.77  72.0  75.0  0
2  2  46  1.0  F  YES  10.0  0.0  0  0  0  250.0  116.0  71.0  20.35  88.0  94.0  0
3  3  50  1.0  M  YES  20.0  0.0  0  1  0  233.0  158.0  88.0  28.26  68.0  94.0  1
4  4  64  1.0  F  YES  30.0  0.0  0  0  0  241.0  136.5  85.0  26.42  70.0  77.0  0
5  5  61  3.0  F  NO  0.0  0.0  0  1  0  272.0  182.0  121.0  32.80  85.0  65.0  1
6  6  61  1.0  M  NO  0.0  0.0  0  1  0  238.0  232.0  136.0  24.83  75.0  79.0  0
7  7  36  4.0  M  YES  35.0  0.0  0  0  0  295.0  102.0  68.0  28.15  60.0  63.0  0
8  8  41  2.0  F  YES  20.0  NaN  0  0  0  220.0  126.0  78.0  20.70  86.0  79.0  0
9  9  55  2.0  F  NO  0.0  0.0  0  1  0  326.0  144.0  81.0  25.71  85.0  NaN  0

Last rows

index  id  age  education  sex  is_smoking  cigsPerDay  BPMeds  prevalentStroke  prevalentHyp  diabetes  totChol  sysBP  diaBP  BMI  heartRate  glucose  TenYearCHD
3380  3380  56  1.0  F  YES  20.0  0.0  0  0  0  240.0  125.0  79.0  27.38  80.0  82.0  0
3381  3381  63  1.0  F  NO  0.0  0.0  0  0  0  205.0  138.0  71.0  33.11  60.0  85.0  1
3382  3382  43  4.0  M  NO  0.0  0.0  0  1  0  260.0  129.0  90.0  25.29  70.0  62.0  0
3383  3383  57  3.0  F  NO  0.0  0.0  0  0  0  210.0  131.0  85.0  26.59  70.0  77.0  0
3384  3384  61  1.0  F  NO  0.0  0.0  0  1  0  217.0  182.0  86.0  26.98  105.0  113.0  0
3385  3385  60  1.0  F  NO  0.0  0.0  0  0  0  261.0  123.5  79.0  29.28  70.0  103.0  0
3386  3386  46  1.0  F  NO  0.0  0.0  0  0  0  199.0  102.0  56.0  21.96  80.0  84.0  0
3387  3387  44  3.0  M  YES  3.0  0.0  0  1  0  352.0  164.0  119.0  28.92  73.0  72.0  1
3388  3388  60  1.0  M  NO  0.0  NaN  0  1  0  191.0  167.0  105.0  23.01  80.0  85.0  0
3389  3389  54  3.0  F  NO  0.0  0.0  0  0  0  288.0  124.0  77.0  29.88  79.0  92.0  0